Networking Multiword Units

Authors

  • Matthieu Constant
  • Patrick Watrin
Abstract

This paper details a network infrastructure for representing and sharing multiword units. It enables connecting local networks describing linguistic semi-fixed components in the form of local grammars.

Similar resources

On multiword lexical units and their role in maritime dictionaries

Multi-word lexical units are a typical feature of specialized dictionaries, in particular monolingual and bilingual maritime dictionaries. The paper studies the concept of the multi-word lexical unit and considers the similarities and differences of their selection and presentation in monolingual and bilingual maritime dictionaries. The work analyses such issues as the classification of multi-w...

Multilingual Aspects of Multiword Lexical Units

As most of the machine-readable dictionaries contain clearly insufficient information about multiword lexical units, there is a constant need to extend and tune specialized lexical databases to account for new expressions. In this paper, we present a system exclusively based on statistics that massively extracts from unrestricted text corpora contiguous and noncontiguous rigid multiword lexical...

A Parallel Multikey Quicksort Algorithm for Mining Multiword Units

In the context of word associations, multiword units (sequences of words that co-occur more often than expected by chance) are frequently used in everyday language, usually to precisely express ideas and concepts that cannot be compressed into a single word. For instance, [Bill of Rights], [swimming pool], [as well as], [in order to], [to comply with] or [to put forward] are multiword units. As...

Unsupervised Multiword Segmentation of Large Corpora using Prediction-Driven Decomposition of n-grams

We present a new, efficient unsupervised approach to the segmentation of corpora into multiword units. Our method involves initial decomposition of common n-grams into segments which maximize within-segment predictability of words, and then further refinement of these segments into a multiword lexicon. Evaluating in four large, distinct corpora, we show that this method creates segments which c...

More Than Words: The Role of Multiword Sequences in Language Learning and Use

The ability to convey our thoughts using an infinite number of linguistic expressions is one of the hallmarks of human language. Understanding the nature of the psychological mechanisms and representations that give rise to this unique productivity is a fundamental goal for the cognitive sciences. A long-standing hypothesis is that single words and rules form the basic building blocks of lingui...

Publication date: 2008